Predicting Library of Congress classifications from Library of Congress subject headings

Authors

  • Eibe Frank
  • Gordon W. Paynter
Abstract

This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: The root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a model that maps from sets of LCSH to classifications from the LCC tree. We present empirical results for our technique showing its accuracy on an independent collection of 50,000 LCSH/LCC pairs.
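The idea of mapping a set of LCSH to a node in the LCC tree can be illustrated with a minimal sketch. This is not the authors' model: it assumes a simple co-occurrence count at each tree node in place of the machine learning techniques the abstract mentions, and the names `train`, `predict`, and the toy catalog records are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def train(pairs):
    """Count how often each heading co-occurs with each child of every tree node.

    `pairs` is an iterable of (set_of_headings, lcc_path), where lcc_path is a
    tuple running from the top of the hierarchy down to the assigned class.
    """
    # counts[path_prefix][child][heading] -> number of training works
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for headings, path in pairs:
        for depth, child in enumerate(path):
            prefix = tuple(path[:depth])
            for heading in headings:
                counts[prefix][child][heading] += 1
    return counts


def predict(counts, headings):
    """Descend from the root, picking the child with the most heading evidence."""
    prefix = ()
    while prefix in counts:
        scores = {
            child: sum(heading_counts.get(h, 0) for h in headings)
            for child, heading_counts in counts[prefix].items()
        }
        best = max(scores, key=scores.get)
        if scores[best] == 0:
            break  # no evidence for any child: stop at the current node
        prefix += (best,)
    return prefix


# Toy training data (assumed for illustration, not real catalog records).
toy_pairs = [
    ({"Machine learning", "Algorithms"}, ("Q", "QA", "QA76")),
    ({"Algebra"}, ("Q", "QA", "QA150")),
    ({"Cataloging", "Classification"}, ("Z", "Z693")),
]
model = train(toy_pairs)
print(predict(model, {"Machine learning"}))
```

A top-down descent like this can stop at an interior node when the headings give no evidence for any child, which returns a broader class in place of a guess at a leaf.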


Similar resources

Predicting Library of Congress Classifications From Library of

This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCC are organized in a tree: the root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified b...


Library of Congress Classification as linked data

In 2009 and in 2011, the Library of Congress made two of its largest authority files, Subject Headings and Names, available as linked data via LC's linked data service, id.loc.gov. Both are offered in MADS/RDF and SKOS. It is LC's objective, in 2012, to publish another of its largest authority files as linked data: LC Classification. However, whereas the source records for Subject He...


User tags versus expert-assigned subject terms: A comparison of LibraryThing tags and Library of Congress Subject Headings

Social tagging, as a recent approach for creating metadata, has caught the attention of library and information science researchers. Many researchers recommend incorporating social tagging into the library environment and combining folksonomies with formal classification. However, some researchers are concerned with the quality issues of social annotation because of its uncontrolled nature. In ...


A Gray Code Based Ordering for Documents on Shelves: Classification for Browsing and Retrieval Journal of the American Society for Information

A document classifier places documents together in a linear arrangement for browsing or high speed access by human or computerized information retrieval systems. Requirements for document classification and browsing systems are developed from similarity measures, distance measures, and the notion of subject aboutness. A requirement that documents be arranged in decreasing order of similarity as...


Knowledge Representation, Learning, and Reasoning in WebDoc -- A Web Document Classification System

This paper describes a novel approach to knowledge representation, learning, and reasoning in WebDoc, a system that classifies Web documents according to the Library of Congress classification system. We argue that an automatically constructed domain-independent knowledge base is indispensable. The WebDoc system builds a knowledge base (represented as a semantic network) that contains the Librar...



Journal:
  • JASIST

Volume 55

Publication date: 2004